Conclusion:

What is Spark?

A fast and general engine for large-scale data processing

  • An open-source cluster computing framework
  • End-to-end analytics platform
  • Developed to overcome limitations of Hadoop MapReduce
  • Runs from a single desktop or a huge cluster
  • Iterative, interactive or stream processing
  • Supports multiple languages: Scala, Python, R, Java
  • Major companies such as Amazon, eBay, and Yahoo use Spark

Advantages of Spark

  • A fast-growing open-source engine
  • Many times faster than MapReduce
    • Keeps intermediate data in memory instead of writing it to disk between steps
  • Runs alongside other Hadoop components
  • Support for many programming languages
    • Scala, R, Python, Java, piping
    • Same functionality across multiple languages
  • Multiple options and libraries: Graph, SQL, ML, Streaming
  • Works with multiple cluster management frameworks
